Analytical memory bandwidth model for many-core processor based systems
Abstract
Many-core processor based systems are gaining popularity in high-performance parallel embedded applications. Estimating the memory bandwidth requirement (that is, the external memory bandwidth) of a target parallel application across various cache sizes requires prohibitively long simulation times. In this work, we propose an analytical model that quickly estimates the memory bandwidth for a given cache size and helps explore the trade-off between cache size and memory bandwidth requirement. We model the stochastic behavior of cache misses for a single cache as a random process. Using central limit theorems for identically or non-identically distributed random processes, we accurately estimate the collective cache misses from hundreds of processor cores and thus the total memory bandwidth requirement of the whole system. The results show that, for 200 cores, our model is up to 200.4 times faster than simulation, while its estimates differ from the simulated results by less than 0.01%.
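The central-limit idea in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it assumes each memory access on a core misses independently with a fixed per-core probability (so per-core miss counts are binomial), that cores are independent, and that miss rates may differ between cores. The function names, the 64-byte line size, and the miss-rate values are all hypothetical choices made for illustration.

```python
import math
import random
from statistics import NormalDist


def clt_bandwidth_estimate(miss_rates, accesses_per_core, line_bytes, quantile=0.99):
    """Normal approximation of total miss traffic (bytes) in one interval.

    Assumption (not from the paper): misses on core i are independent
    Bernoulli trials with probability miss_rates[i].
    """
    # Sum of per-core binomial means and variances (non-identical cores).
    mean = sum(accesses_per_core * p for p in miss_rates)
    var = sum(accesses_per_core * p * (1.0 - p) for p in miss_rates)
    z = NormalDist().inv_cdf(quantile)
    mean_bytes = mean * line_bytes
    high_bytes = (mean + z * math.sqrt(var)) * line_bytes
    return mean_bytes, high_bytes


def simulate_total_misses(miss_rates, accesses_per_core, rng):
    """Brute-force reference: draw every access on every core."""
    return sum(
        sum(1 for _ in range(accesses_per_core) if rng.random() < p)
        for p in miss_rates
    )


if __name__ == "__main__":
    # Hypothetical workload: 200 cores with miss rates spread over 2%..6%.
    rates = [0.02 + 0.04 * i / 199 for i in range(200)]
    mean_b, high_b = clt_bandwidth_estimate(rates, 1000, 64)
    simulated = simulate_total_misses(rates, 1000, random.Random(1)) * 64
    print(f"analytical mean: {mean_b:.0f} B, 99% bound: {high_b:.0f} B")
    print(f"simulated total: {simulated} B")
```

The analytical estimate needs one pass over the per-core miss rates, whereas the reference simulation draws every access on every core, which is the kind of cost gap the abstract describes.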
Similar resources
Ultra-Low-Energy DSP Processor Design for Many-Core Parallel Applications
Background and Objectives: Digital signal processors are widely used in energy-constrained applications in which battery lifetime is a critical concern. Accordingly, designing ultra-low-energy processors is a major concern. As a first step in this work, we propose a sub-threshold DSP processor. Methods: As our baseline architecture, we use a modified version of an existing ultra-low-power...
Joint Exploration of Hardware Prefetching and Bandwidth Partitioning in Chip Multiprocessors
In this paper, we propose an analytical model-based study to investigate how hardware prefetching and memory bandwidth partitioning impact Chip Multi-Processors (CMP) system performance and how they interact. The model includes a composite prefetching metric that can help determine under which conditions prefetching can improve system performance, a bandwidth partitioning model that takes into ...
MPI communication on MPPA Many-core NoC: design, modeling and performance issues
Power dissipation and energy consumption have become a major issue for high performance computing and embedded systems. Keeping up with the performance trend of the last decades can no longer be achieved by stepping up the clock speed of processors. The usual strategy nowadays is to use a lower frequency and to increase the number of cores. On such recent systems, data communication and memory ba...
Asymmetries in Multi-Core Systems – Or Why We Need Better Performance Measurement Units
Future exascale systems will be based on multi-core processors, but even today’s multi-core processors can be asymmetric and exhibit limitations and bottlenecks that are different from those found on a symmetric multiprocessor. In this paper we investigate the performance of a cluster node based on the Intel Xeon E5345 quad-core processor and note that despite the symmetry implied by the progra...
On Memory Contention Problems in Vector Multiprocessors
Memory interleaving considerably increases memory bandwidth in vector processor systems. The concurrent operation of the processors can produce memory bank conflicts and hence alter the memory bandwidth. Total or steady state performance for vector operations in a memory system is studied. Many methods of resolving memory bank conflicts are proposed and compared. Analytical results on the result...
Journal:
- IEICE Electronics Express
Volume 9
Publication date 2012